BACKGROUND


In response to inquiries from Congressional representatives,
the Acting Assistant Secretary of Defense (Health
Affairs)  requested  that  the  Army  document  a  Department  of 
Defense  (DoD)  position  regarding  an  extension  of  the  Fort  Bragg 
Mental  Health  Demonstration  Project.  It  was  requested  that  the 
Army  establish  a  panel  of  Army/DoD  experts  (psychiatrists, 
psychologists,  other  clinicians,  and  clinical  statisticians)  to 
review  the  evaluation  and  other  related  data  concerning  the 
Demonstration  Project  in  order  to:  (1)  support  a  DoD  position 
on  the  level  of  confidence  necessary  to  confirm  treatment 
results/conclusions,  and  (2)  indicate  the  impact  of  an  Army 
approved  evaluation  due  date  on  that  level  of  confidence. 

This  technical  report  presents  an  independent  statistical 
analysis/review.  No  actual  data  from  the  Fort  Bragg 
Child/Adolescent  Mental  Health  Demonstration  Project  or  the  Fort 
Bragg  Evaluation  Project  were  made  available.  However, 
information  contained  in  a  letter  (shown  as  Appendix  A)  written 
by  Dr.  Lenore  Behar,  Ph.D.,  Head  of  the  Child  and  Family 
Services Branch, North Carolina Department of Human Resources,
to  Mr.  Leo  Sleight,  Central  Contracting  Office,  Department  of 
the  Army,  Headquarters  U.S.  Army  Health  Services  Command,  Fort 
Sam  Houston,  Texas,  dated  February  15,  1993,  was  provided  by 
Vanderbilt University. In the letter, Dr. Behar presented two
data  collection  plans.  These  plans,  one  Short-Term  and  one 
Long-Term,  differ  in  the  number  of  cases  collected  at  'Wave  3'. 
The  effectiveness  of  each  plan  was  described  by  means  of  a  power 
value  of  a  statistical  test  for  detecting  differences  in 
improvement  in  mental  health  outcomes  between  Demonstration  and 
Comparison  cases.  In  addition,  a  reprint  of  a  paper  submitted 
to  the  1992  American  Psychological  Association  Convention 
addressing  power  analysis  in  psychotherapy  research  was 
furnished. This paper is included as Appendix B. Also
submitted  was  documentation  supporting  the  power  values  in 
Appendix  A  in  materials  attached  to  a  letter  dated  April  30, 
1993,  written  by  Dr.  Leonard  Bickman,  Ph.D.,  Director  of  the 
Center  for  Mental  Health  Policy,  Institute  for  Public  Policy 
Studies,  Vanderbilt  University,  to  LTC  Thomas  E.  Leonard, 
Headquarters  U.S.  Army  Health  Services  Command,  Fort  Sam 
Houston,  Texas.  Pertinent  portions  of  this  documentation  are 
included  as  Appendix  C. 


POWER  ANALYSIS  COMPARISON 
OF  TWO  DATA  COLLECTION  PLANS 


Power  Analysis  Assumptions. 

In the statistical assumptions presented in Appendix A,
the type of variable(s) used to measure 'improvement' between an
average Demonstration case and an average Comparison case was
not defined. The paper shown in Appendix B was referenced
instead, presenting the results of a meta-analysis for 12
categories of outcome measures, six each for behavioral and
nonbehavioral treatments. It appears that the Fort Bragg
Evaluation Project used the Appendix B paper to obtain the value
of the effect size (ES) for Normed Rating Scales (Nonbehavioral
Treatment outcome measures), as this value is included in
Appendix A. In Appendix A (p. A-6), it is stated that the
Short-Term Plan has 50% power and the Long-Term Plan of data
collection would have 80% power. These levels of power were
based on a simulation model submitted by Vanderbilt University
(Appendix C).

The effect size (ES) index identified as d by Cohen
(1988) is the standardized difference between two population
means. This equation is as follows:

\[
d = \frac{m_A - m_B}{\sigma}
\]

where d = ES index for t test of means,
$m_A$, $m_B$ = population means,
and $\sigma$ = standard deviation of either population
(equal variance is assumed).


The effect size value (ES = 0.25) derived in Appendix B (p. B-2)
and cited in Appendix A (p. A-5) should be used with caution for
several reasons. First, this value was computed for a series of
12 sub-group samples. The Normed Rating Scale used to derive
the power in Appendix A was based on a mean sample of only 33
cases. The authors of the Appendix B paper stated this problem
of variability as follows (p. B-2): "The large discrepancies
between sample sizes actually used and those necessary to attain
an acceptable level of power in the studies shown in Table 1
make it difficult to assess how closely the obtained treatment
effect sizes represent true population effects. This, in turn
underscores the need for researchers to attend to power
considerations when planning therapy outcome studies." When a
meta-analysis is based on such a small sample size, the probability
of error is high. As a result, the mean effect size (ES = 0.25)
used in Appendix A may or may not express score distances (in
units of variability) for the actual variables measuring health
outcome in the Fort Bragg Evaluation Project.

Secondly, there is always a risk that meta-analysis may
have employed inappropriate assumptions with regard to the
validity of pooling and generality. For instance, the
meta-analysis may contain some bias as to how the outcome should be
produced, excluding some relevant trials from analysis. In
other instances, meta-analysis may use multiple results from the
same study, and because the results are not independent they may
bias or invalidate the meta-analysis. In other cases, the
independent studies may include different measuring techniques
and definitions of variables, so the outcomes may not be
comparable. In general, effect sizes in unique areas are likely
to be small (ES = 0.20 or ES = 0.30), but only a pilot test
would give an answer as to the probable magnitude of the ES
index for the particular variable of interest in a particular
situation.

The power and sample size tables (Cohen, 1988) for the
above specified ES = 0.25 in Appendix A are designed to analyze
the difference between means of two independent samples of the
same size drawn from normal populations with equal variances
(using the t test for means). If these assumptions cannot be
made, which is often the case, the additional adjustments that
follow are explicitly supported by Cohen (1988) and others.
Computations  should  be  performed  to  obtain  the  harmonic  mean  if 
samples  of  different  sizes  but  equal  variance  are  present,  and 
the  root  mean  square  should  be  computed  if  two  samples  of  the 
same  size  having  unequal  variances  are  present.  If  both  sample 
sizes  and  variances  differ,  the  values  for  power  formulas  from 
the  tables  cited  in  Appendix  A  may  not  be  valid. 
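
The root-mean-square adjustment for unequal variances can be sketched in a few lines of Python. This is only an illustration: the function names and all numeric inputs (means of 52.5 and 50.0, standard deviations of 9 and 11) are hypothetical and are not Fort Bragg data.

```python
import math


def rms_sd(sd_a: float, sd_b: float) -> float:
    """Root mean square of two standard deviations.

    Used in place of a common sigma when two samples of the same
    size have unequal variances.
    """
    return math.sqrt((sd_a ** 2 + sd_b ** 2) / 2.0)


def effect_size_d(mean_a: float, mean_b: float, sd: float) -> float:
    """Standardized mean difference d = (m_A - m_B) / sd."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    return (mean_a - mean_b) / sd


# Hypothetical illustration values, not data from the Evaluation Project.
sd = rms_sd(9.0, 11.0)
d = effect_size_d(52.5, 50.0, sd)
print(f"RMS sd = {sd:.3f}, d = {d:.3f}")
```

With equal standard deviations the root mean square reduces to the common value, so the adjustment changes nothing when the equal-variance assumption holds.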

Since  no  actual  data  were  available  from  the  Fort  Bragg 
Evaluation  Project,  this  review  will  utilize  the  data  used  by 
Vanderbilt  University  for  this  analysis.  Appendix  A  contains  a 
comparison  of  the  two  data  collection  plans  using  power 
analysis.  The  Appendix  A  power  analysis  comparison  presents  the 
number  of  cases  after  attrition  for  both  the  Short-Term  and 
Long-Term Plans (p. A-6). For the Short-Term Plan, 299
Demonstration  cases  and  150  Comparison  cases  were  expected.  The 
following  power  analysis  is  based  on  Cohen's  formulas  and  uses 
the  information  supplied  in  Appendix  A.  This  analysis  is 
followed  by  a  discussion  of  the  simulation  submitted  by 
Vanderbilt  University  and  included  as  Appendix  C. 


Power Analysis of Short and Long-Term Plans.

Under the assumption that the variances in the
Demonstration and Comparison sites are equal, the harmonic mean
(n) of the Demonstration sample size ($n_D$) and the Comparison
sample size ($n_C$) is given by the formula (Cohen, 1988)

\[
n = \frac{2 n_D n_C}{n_D + n_C} = \frac{2(299)(150)}{299 + 150} = \frac{89{,}700}{449} \approx 200
\]
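
As a check on the arithmetic, the harmonic mean can be computed directly. The sketch below is illustrative only; the function name is an assumption, and the case counts are the post-attrition projections stated in Appendix A for the two plans.

```python
def harmonic_mean_n(n_d: int, n_c: int) -> float:
    """Harmonic mean of two sample sizes: 2 * n_D * n_C / (n_D + n_C)."""
    if n_d <= 0 or n_c <= 0:
        raise ValueError("sample sizes must be positive")
    return 2.0 * n_d * n_c / (n_d + n_c)


# Case counts projected in Appendix A (Demonstration, Comparison).
short_term = harmonic_mean_n(299, 150)
long_term = harmonic_mean_n(426, 361)
print(f"Short-Term n = {short_term:.1f}, Long-Term n = {long_term:.1f}")
```

The harmonic mean is pulled toward the smaller group, which is why the Short-Term value sits well below the simple average of 299 and 150.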


The value for power of the t test of the Demonstration case mean
($m_D$) and the Comparison case mean ($m_C$), testing the null
hypothesis that $m_D = m_C$ at $\alpha_1 = 0.05$ (one-tailed test) (Table
2.3.2 from Cohen, 1988), gives the following results:

for n = 200 and ES = 0.20, power = 0.64, and
for n = 200 and ES = 0.30, power = 0.91.

The effect size, proposed in Appendix A and derived from a
meta-analysis performed in Appendix B, is 0.25. A linear
interpolation was performed to derive the power of the t test
for ES = 0.25. This computation yielded a power of 0.78 for ES
= 0.25, $\alpha_1 = 0.05$ and n = 200. This power of 0.78 (78%), as
computed for the Short-Term Plan, is much higher than the 0.50
(50%) quoted in Appendix A. A full precision computation of the
power for the Short and Long-Term Plans is presented in the next
section of this report.
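
The interpolation step can be reproduced as follows. This is a minimal sketch: the function name is an assumption, and the two tabled powers (0.64 and 0.91 at n = 200) are the values quoted in this report from Cohen's Table 2.3.2.

```python
def interpolate_power(es: float, es_lo: float, power_lo: float,
                      es_hi: float, power_hi: float) -> float:
    """Linearly interpolate power between two tabled effect sizes."""
    if es_hi <= es_lo or not (es_lo <= es <= es_hi):
        raise ValueError("es must lie between es_lo and es_hi")
    fraction = (es - es_lo) / (es_hi - es_lo)
    return power_lo + fraction * (power_hi - power_lo)


# Tabled values quoted in the report for n = 200, one-tailed alpha = 0.05.
power = interpolate_power(0.25, 0.20, 0.64, 0.30, 0.91)
print(f"Interpolated power at ES = 0.25: {power:.3f}")
```

The midpoint of 0.64 and 0.91 is 0.775, which the report rounds to 0.78. Because power is not linear in the effect size, this figure is an approximation.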

The Long-Term Plan projects 426 Demonstration cases and
361 Comparison cases. This harmonic mean, computed under the
assumption that the variances are the same, is as follows
(Cohen, 1988):

\[
n = \frac{2 n_D n_C}{n_D + n_C} = \frac{2(426)(361)}{426 + 361} = \frac{307{,}572}{787} = 390.8 \approx 391.
\]


Employing Table 2.3.2 in Cohen (1988), n = 350 yields power =
84% for ES = 0.20 and power = 99% for ES = 0.30. For n = 400,
power = 88% for ES = 0.20 and power is greater than 99% for ES =
0.30. The linear approximation yields a power of 93.3% for ES =
0.25 (for n = 391).


Computational Procedure for the Exact Power
of the Short and Long-Term Plans.

The linear interpolation to compute power, discussed in
the preceding section, was justified by its simplicity and by the
relatively accurate values obtained. The full precision in
computing the power for the Short and Long-Term Plans was based
on the expression (Cohen, 1988)

\[
z_{1-\beta} = \frac{d(n-1)\sqrt{2n}}{2(n-1) + 1.21(z_{1-\alpha} - 1.06)} - z_{1-\alpha}
\]

where $z_{1-\beta}$ = the percentile of the standard normal
distribution giving the power value,
$z_{1-\alpha}$ = the percentile of the standard normal
distribution for significance level,
d = the effect size ES,
and n = the harmonic mean.

For the Short-Term Plan, the following information was
available:

n = 200
$\alpha_1$ = 0.05
d = 0.25
$z_{1-\alpha}$ = 1.645.

The $z_{1-\beta}$ percentile was computed under these assumptions from the
above formula:

\[
z_{1-\beta} = \frac{(0.25)(200 - 1)\sqrt{2(200)}}{2(200 - 1) + 1.21(1.645 - 1.06)} - 1.645
= \frac{(0.25)(199)(20)}{398 + (1.21)(0.585)} - 1.645
= \frac{995}{398.708} - 1.645
= 2.496 - 1.645 = 0.851.
\]


The probability for this $z_{1-\beta}$ percentile was found from the
Normal Curve Areas Table C (Daniel, 1988). This probability
represents the power of the test and is equal to 80.258%. The
Short-Term Plan gives a statistical power (computed with full
precision) exceeding 80%.

A  similar  computation  was  performed  for  the  Long-Term  Plan 
under  the  following  assumptions: 

n = 391
$\alpha_1$ = 0.05
d = 0.25
$z_{1-\alpha}$ = 1.645.

The $z_{1-\beta}$ percentile found from the same formula (Cohen, 1988)
was computed as follows:

\[
z_{1-\beta} = \frac{(0.25)(391 - 1)\sqrt{(2)(391)}}{2(391 - 1) + 1.21(1.645 - 1.06)} - 1.645
= \frac{(97.5)(27.964)}{780 + 0.70785} - 1.645
= \frac{2{,}726.516}{780.708} - 1.645
= 3.492 - 1.645 = 1.847.
\]

The power for this value of $z_{1-\beta}$ found from the Normal Curve
Areas Table C (Daniel, 1988) is equal to 96.78%.
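
The two full-precision computations can be cross-checked with a short Python sketch of the same approximation. The function name is an assumption. The critical value 1.645 is the one used in this report, and the normal CDF is evaluated numerically rather than read from a printed table, so the last digits may differ slightly from the tabled figures.

```python
import math
from statistics import NormalDist


def approx_power(d: float, n: float, z_alpha: float = 1.645):
    """Approximate power of a one-tailed t test of two means.

    Implements z_power = d(n-1)sqrt(2n) / (2(n-1) + 1.21(z_alpha - 1.06)) - z_alpha
    and converts z_power to a probability with the standard normal CDF.
    Returns the pair (z_power, power).
    """
    if n <= 1:
        raise ValueError("n must exceed 1")
    numerator = d * (n - 1) * math.sqrt(2 * n)
    denominator = 2 * (n - 1) + 1.21 * (z_alpha - 1.06)
    z_power = numerator / denominator - z_alpha
    return z_power, NormalDist().cdf(z_power)


# Harmonic-mean sample sizes and effect size taken from the report.
for label, n in (("Short-Term", 200), ("Long-Term", 391)):
    z_power, power = approx_power(0.25, n)
    print(f"{label}: z = {z_power:.3f}, power = {power:.4f}")
```

Run as written, the sketch should give a power of roughly 0.80 for the Short-Term Plan and roughly 0.97 for the Long-Term Plan, in line with the hand computations.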


The power analysis shown above projects that the number of
cases in the Short-Term Plan is currently sufficient to draw
statistically significant conclusions with high statistical
power (80.258%). An additional reason for this conclusion is
found by using the sample size tables provided by Cohen (1988)
and deriving the sample size necessary to achieve full 80%
power. Sample size tables provide data for two homogeneous
normally distributed populations from which random samples of
the same size were derived. The ES specified in Appendix A is
0.25. This ES level is not tabulated by Cohen (1988).
Therefore, to find the sample size for an untabulated effect
size, the following formula is used (Cohen, 1988):


\[
n = \frac{n_{.10}}{100d^2} + 1
\]

where $n_{.10}$ is the sample size for desired power,
given $\alpha$ and ES = 0.10,
and d is the effect size.


In addition, if the sample sizes are not equal, one sample size
is treated as if fixed, while the other is computed. When the
choice is arbitrary, it is generally recommended that $n_C$ be fixed
and $n_D$ be computed. To find $n_D$, the following formula is used
(Cohen, 1988):

\[
n_D = \frac{n_C n}{2n_C - n}
\]

where $n_C$ = fixed sample size (Comparison sites),
n = value read from Table 2.4.1 (Cohen, 1988) or computed
from the previous equation,
and $n_D$ = sample size for the Demonstration site.


With the objective to determine the Demonstration case
sample size required to yield a power = 80% with $\alpha_1 = 0.05$ and
ES = 0.25, and fixing the Comparison cases at $n_C = 150$ (the
current level), the formula for computing n is:

\[
n = \frac{n_{.10}}{100d^2} + 1 = \frac{n_{.10}}{100(0.25)^2} + 1 = \frac{n_{.10}}{6.25} + 1 = 198 + 1 = 199.
\]

Source: Table 2.4.1 (Cohen, 1988).


Next, this value is put into the formula for $n_D$:

\[
n_D = \frac{n_C n}{2n_C - n} = \frac{(150)(199)}{2(150) - 199} = \frac{29{,}850}{300 - 199} = \frac{29{,}850}{101} = 295.54 \approx 296.
\]

Consequently, 296 Demonstration site patients are needed to
assure an 80% power for the test investigating the difference in
mental health outcomes between Demonstration and Comparison
patients (299 were projected in Appendix A).

The identical procedure was applied to the Long-Term Plan.
Given that the Comparison sites consist of 361 cases, and
assuming the same conditions ($\alpha_1 = 0.05$, ES = 0.25, power =
0.80), a sample size of 138 cases for the Demonstration site was
obtained:

\[
n = \frac{n_{.10}}{100d^2} + 1 = 199
\]

\[
n_D = \frac{n_C n}{2n_C - n} = \frac{(361)(199)}{2(361) - 199} = \frac{71{,}839}{722 - 199} = \frac{71{,}839}{523} = 137.36, \text{ rounded up to } 138.
\]

As proposed in Appendix A, the Long-Term Plan is projected to
produce 426 Demonstration cases. Using Vanderbilt University's
information taken from Appendix A, the above analysis shows that
only 138 cases are statistically necessary to achieve 80% power.
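
The unequal-sample-size step can be sketched in Python as follows. The function name is an assumption, and the equal-n requirement of 199 is taken as given from the computation in this report rather than re-derived from Cohen's table.

```python
import math


def demonstration_n(n_c: int, n: int) -> int:
    """Demonstration sample size when the Comparison size n_c is fixed.

    Implements n_D = n_C * n / (2 * n_C - n), rounded up to a whole case.
    The formula has no solution unless 2 * n_C exceeds n.
    """
    denominator = 2 * n_c - n
    if denominator <= 0:
        raise ValueError("n_c is too small: 2 * n_c must exceed n")
    return math.ceil(n_c * n / denominator)


# n = 199 is the equal-n requirement obtained in the report for
# power = 0.80, one-tailed alpha = 0.05, ES = 0.25.
print("Short-Term (n_C = 150):", demonstration_n(150, 199))
print("Long-Term  (n_C = 361):", demonstration_n(361, 199))
```

The guard on the denominator reflects a real limit of the formula: if the fixed group has half or fewer of the cases required under equal sample sizes, no size of the other group can reach the target power.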


Assessment  of  the  Simulation  Method. 

Vanderbilt University's use of the Monte Carlo simulation
method to perform a power analysis in the present situation is
an inappropriate application of this type of simulation. Using
simulation to compute the power analysis without any information
about the actual data is not an appropriate use of either
simulation or power analysis. Concerning simulation, Miller and
Starr (1969) state:

"... Simulation is not a substitute for knowledge
[emphasis by authors]. This cannot be overemphasized.
Simulation is not a method which, somehow,
compensates for lack of knowledge."

In general, simulation should be treated as a technique of "last
resort" (Naylor, 1971), to be used only when analytical
techniques are not available for obtaining solutions to a given
model. Power analysis gives the correct probability of getting
a significant result of Comparison and Demonstration site means
only when the effect size is computed precisely (i.e., based on
actual data from actual variables in the experiment under
consideration).

The  use  of  simulation  requires  complete  information  about 
the  process  or  object.  In  order  to  simulate  reasonably,  the 
probability  distributions  of  the  variables  of  interest  should  be 
known.  If  these  distributions  are  not  known,  it  is  impossible 
to  simulate  the  process.  This  position  is  strongly  emphasized 
by many authorities in operations research (Naylor; Ignizio and
Gupta; Buffa; Smith; Banks and Carson; Gibra; and Miller and
Starr). It is critical that estimates of parameters of the
simulation model be derived on the basis of observations taken
from the actual data. Naylor (1971) states:

"... There is very little to be gained by using an
inadequate model to carry out simulation experiments
on a computer because we would merely be simulating
our own ignorance."

Since  the  Monte  Carlo  technique  presented  in  Appendix  C  does  not 
involve  actual  data,  the  results  obtained  from  this  method  may 
be  entirely  misleading  and  not  accurate.  The  simulation  shown 
in  Appendix  C  is  based  on  assumptions  regarding  the  effect  size 
(ES  =  0.25).  This  value,  derived  from  meta-analysis  (Appendix 
B, p. B-2), may not apply to real differences between the mean
values of mental health outcomes for the Demonstration and
Comparison sites. Another assumption (Appendix A, p. A-5),
regarding  the  average  child  improvement  by  0.3  SD,  due  to 
treatment  and  time,  is  only  theoretical  because  it  is  not  based 
on  actual  data. 

As stated above, Monte Carlo simulation should only be
utilized when direct data analysis cannot be performed (Gibra,
1973), which is not the case with the Fort Bragg Evaluation
Project. In addition, the real probability distributions of all
the random variables under consideration must be given (Gibra,
1973), a fact ignored in Appendix C. The Monte Carlo method
gives only approximations to sampling distributions (Snedecor
and Cochran, 1980). To this extent, the technique itself is
subject to sampling error.
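
The size of this sampling error can be illustrated under a simple assumption: if each simulated replication independently either rejects the null hypothesis or does not, the estimated power is a binomial proportion. The sketch below is illustrative only. The function name and the replication counts are hypothetical, and the number of replications actually used in Appendix C is not assumed here.

```python
import math


def mc_standard_error(power: float, replications: int) -> float:
    """Binomial standard error of a simulated power estimate.

    Assumes independent replications, each of which rejects the
    null hypothesis with probability `power`.
    """
    if not 0.0 <= power <= 1.0 or replications <= 0:
        raise ValueError("power must be in [0, 1] and replications positive")
    return math.sqrt(power * (1.0 - power) / replications)


# Hypothetical replication counts, shown only to indicate the scale
# of sampling error around a simulated power of 0.50.
for reps in (100, 1_000, 10_000):
    print(f"{reps:>6} replications: SE = {mc_standard_error(0.50, reps):.4f}")
```

This error shrinks only with the square root of the number of replications, and it is separate from any error caused by misspecified input distributions.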

Another observation about the Appendix C discussion was
that the Monte Carlo method was performed only for one variable
(CBCL); no other variables were used. The analysis might have had
different results if the other variables were considered.
Finally, before any simulation model can be accepted it must be
verified and validated to identify model biases and erroneous
assumptions, if any. The authors of the modeling as reported in
Appendix C included no such validation.

Without the use of actual data, the effect size value
(derived from the meta-analysis cited in Appendix B) was used to
calculate the power in this report. This effect size was
recommended by the staff of the Fort Bragg Evaluation Project.
Although not considered actual data, the effect size allowed for
no additional bias to be created by the Monte Carlo method. The
equations used to compute the power of the test of means in this
report are supported by numerous authorities in power analysis
(Cohen, 1988).


CONCLUSION 


The  power  values  for  the  directional  tests  computed  in 
this  study  and  the  values  given  in  the  proposal  in  Appendix  A 
are  significantly  different.  Utilizing  information  available  in 
Appendix  A  and  a  methodology  well  supported  in  the  statistical 
literature,  this  study  demonstrates  that  the  Short-Term  Plan 
would  yield  power  exceeding  80%  (80.258%)  at  full  precision, 
instead  of  50%  as  presented  in  Appendix  A.  Even  using  linear 
interpolation,  a  power  of  78%  was  derived.  This  study 
demonstrates  that  it  is  unnecessary  to  extend  the  duration  of 
the  project  based  on  power  requirements;  the  Short-Term  Plan 
should  produce  high  power  to  demonstrate  significance  if  the 
alternative hypothesis is true. The Demonstration sample size
$n_D$ needed to achieve 80% power for the Short-Term Plan ($\alpha_1$ =
0.05, $n_C$ = 150, ES = 0.25) equals 296 cases.

Secondly,  because  the  standardized  effect  size  is  a 
computed  variable,  it  can  be  modified.  This  modification  can  be 
achieved  by  any  of  several  methods  currently  available  to  the 
Fort  Bragg  Evaluation  Project  staff  without  any  project 
extension.  Variance  can  be  reduced,  thereby  allowing  a  decrease 
in  sample  size  necessary  to  detect  a  particular  level  of  effect 
size  at  a  specified  power  by  increasing  quality  control  in  data 
collection  and  preparation  for  analysis.  For  example,  each 
outcome  should  be  used  in  as  sensitive  a  form  as  can  be  reliably 
measured (the variable of interest should always be measured on a
continuum, not dichotomized). Unnecessary dichotomization
causes  a  loss  of  power  in  all  analyses.  Consequently,  a  much 
larger  sample  is  necessary  to  achieve  the  same  power. 

Finally, as stated above, a more accurate estimate of the
Fort Bragg Evaluation Project effect size is achieved when
actual data are utilized and a full post hoc power analysis is
conducted. The advisability of performing post hoc power
analysis is strongly supported by Cohen (1988), Rossi
(1990), Bailar (1992), and numerous authorities on power
analysis in the behavioral/medical sciences.


APPENDIX A

LETTER DATED FEBRUARY 15, 1993, FROM
DR. LENORE BEHAR TO MR. LEO SLEIGHT


North Carolina Department of Human Resources
Division of Mental Health, Developmental Disabilities
and Substance Abuse Services

325 North Salisbury Street • Raleigh, North Carolina 27603 • Courier # 56-20-24

James B. Hunt Jr., Governor     Michael S. Pedneau, Director

C. Robin Britt, Secretary     (919) 733-7011

February  15,  1993 

Mr.  Leo  Sleight 
Central  Contracting  Office 
HSAA-C,  Building  2015 
Department of the Army
Headquarters, U.S. Army Health Services Command
Fort  Sam  Houston,  Texas  78234-6000 

Re: DADA10-89-C-0013, Fort Bragg Child/Adolescent Mental Health
Demonstration  Project;  Extension  of  Evaluation  Component. 

Dear  Leo: 

We  have  reviewed  the  status  of  the  Evaluation  Component  of  the 
Fort Bragg Child/Adolescent Mental Health Demonstration Project
and  find  that,  in  keeping  with  the  contract  and  with  the 
Vanderbilt  Statement  of  Work,  the  following  reports  will  be 
submitted  by  September  30,  1993: 

1.  Implementation  Study,  Final  Report. 

2.  Quality  Study,  Final  Report. 

3.  Cost  Study,  Interim  Report.  As  explained  in  Attachment  1, 
the  data  to  be  used  for  an  Interim  Report  of  the  Cost  Study 
to  be  submitted  in  September  1993  will  be  for  FY92.  As  this 
report will be prepared during the last quarter of FY93,
CHAMPUS data after September 1992 would be unstable given
the time lag between the date of service and the appearance
of those costs in the data. Another reason for using FY92
cost data is that Gateway cost data for FY93 would not be
available in an analyzable form for a September 1993 report.

As  explained  in  Attachment  2,  it  is  not  possible  to  complete  the 
Outcome  Study  with  an  acceptable  level  of  confidence  by  September 
1993.  If  data  were  to  be  collected  using  the  time  frame  proposed 
in  the  short  term  plan,  the  level  of  confidence,  based  on  a 
sophisticated  power  analysis  specific  to  this  type  of  study, 
would  be  at  the  .50  level.  As  you  know,  this  level  of  confidence 
is  comparable  to  flipping  a  coin.  We  believe  instead  that  the 
Outcome  Study  should  be  completed  as  originally  designed  to  yield 
results at the .80 level of confidence. To achieve this goal,
Wave 1 data should be collected through June 30, 1993 at the
Demonstration  site  and  through  December  31,  1993  at  the 
Comparison  sites;  Wave  2  and  Wave  3  data  should  be  collected 
through  June  30,  1994.  Cost  data  specific  to  the  clients  in  the 
study  will  need  to  be  analyzed  through  the  same  time  period  in 
order  to  determine  Cost  Effectiveness.  A  final  report  of  the 
Outcome  Study  and  the  companion  Cost  Effectiveness  Study  will  be 
issued in September 1994. Costs of extending this portion of the
Evaluation  Component  to  completion  are  provided  as  Attachment  3. 

It does not seem sensible to have invested in the Outcome Study
thus  far  and  terminate  it  short  of  having  adequate  information  to 
reach  a  conclusion  regarding  the  impact  of  the  Demonstration 
Project on treatment outcomes. I will point out that no CHAMPUS
evaluations  in  the  past  have  addressed  outcome,  but  rather  have 
studied  utilization  and  cost  only.  This  absence  of  outcome  data 
has  been  raised  as  a  deficiency  in  the  evaluation  of  the  CPA 
Norfolk  Demonstration  (Burns,  1993,  Attachment  4).  I  believe 
that  the  opportunity  should  not  be  prematurely  abandoned  to 
determine  whether  or  not  the  methods  of  service  delivery  affect 
treatment  outcomes.  As  we  have  discussed  earlier,  the  delays 
which  resulted  from  the  failure  of  HSC  to  provide  access  to 
necessary  data  during  the  first  two  years  of  the  project  have 
seriously  compromised  the  completion  of  the  Outcome  Study  in  a 
timely  fashion.  During  those  two  years,  I  repeatedly  emphasized 
the  anticipated  costliness  of  HSC's  delays  in  providing  access  to 
data,  so  none  of  us  should  be  surprised  by  the  need  to  extend  the 
Evaluation  Component  at  this  point. 

As  I  noted  in  my  letter  dated  December  16,  1992  (Attachment  5),  I 
have  discussed,  with  the  various  stakeholders,  your  plan  to  end 
the  Evaluation  Component  before  it  is  completed  based  on  your 
belief  that  sufficient  information  exists  to  document  the  success 
of  the  project.  I  believe  that  those  stakeholders  maintain  the 
same position now as they did in December; that is, that they
wish  to  have  unbiased  and  convincing  evidence  regarding  this 
project  and  until  such  data  are  presented  and  accepted,  we  will 
need  to  continue  the  objective  evaluation.  I  believe  that  this 
position  is  sound  considering  the  issues  from  a  scientific 
perspective.  I  trust  you  will  endorse  the  merits  of  this 
position  and  support  the  completion  of  the  Outcome  Study  and  the 
Cost  Effectiveness  Study. 


Sincerely, 


Head,  Child  and  Family  Services  Branch 


cc:  Mr.  James  Newman 


Attachment 1


PROPOSED  CONTENTS  OF  COST  STUDY  OF 
THE  SECOND  INTERIM  REPORT  (September  30,  1993) 


The Cost Study portion of the Fort Bragg Evaluation will assemble data from
CHAMPUS records, Rumbaugh’s management information system (MIS), Fort
Campbell’s  MIS  and  Fort  Stewart’s  medical  records  into  an  integrated  utilization 
database.  In  addition,  unit  cost  measures  will  be  collected  from  each  site,  or 
estimated  where  not  directly  available.  Using  these  data,  Vanderbilt  will  produce  an 
Interim  Report  to  be  submitted  on  September  30,  1993.  In  order  to  minimize  bias 
due  to  start-up  issues  at  the  Demonstration,  the  report  will  be  limited  to  the  FY92 
time  period  (October  1991  -  September  1992).  The  lack  of  stability  of  CHAMPUS 
data  for  the  period  of  time  after  September  1992  precludes  inclusion  of  further  data 
in  the  Interim  Report.  The  Final  Report,  which  will  be  submitted  in  September,  1994 
will  include  subsequent  cost  data  for  all  sites. 

The  Interim  Report  will  provide  a  comparative  analysis  in  tabular  and/or  graphic 
form,  of  the  following: 


Service Category                  Measures & Statistics

Residential Services              $ per day per eligible child
  Hospital                        $ per day per child served
  RTC                             Admissions
  Group & Therapeutic Home        Children served*
                                  Length of stay*

Non-Residential Services          $ per day per eligible child
  Day TX/In-Home                  $ per day per child served
  Outpatient                      Admissions
  Medical Services                Length of episode*
    (meds & med evals)            Number of episodes*
  Support Services
    (non-direct services,
    e.g., Treatment Team
    activities & case man.
    phone calls)


*Mean,  median,  maximum  and  minimum  will  be  presented  for  these  measures. 




Attachment 2


COMPARISON  OF  TWO  DATA  COLLECTION  SCENARIOS 
USING  POWER  ANALYSIS 

A  power  analysis  was  conducted  to  predict  the  consequences  of  ending  data  collection 
for  the  Evaluation  before  three  waves  of  data  could  be  collected  on  the  targeted 
number  of  clients  that  is  specified  in  the  statement  of  work.  Two  plans  have  been 
discussed  that  differ  in  how  long  data  would  be  collected.  The  objective  of  power 
analysis  is  to  determine  the  number  of  cases  for  which  data  should  be  collected  in 
order  to  determine  the  effectiveness  of  the  Demonstration  on  children’s  mental 
health  outcomes.  The  threat  of  collecting  data  from  too  few  participants  is  that  the 
statistical  analysis  may  indicate  that  results  were  due  to  chance  when,  in  fact,  there 
was  an  undetected  effect.  Obviously  if  the  final  analysis  misses  the  effect  and  tells  us 
only  that  what  we  observed  may  be  due  to  chance,  the  money  and  effort  invested  in 
the  Evaluation  and  the  Demonstration  will  have  been  wasted. 

Power analysis is a specialized branch of psychological statistics that calculates how
many  subjects  are  needed  to  be  assured  that  the  results  are  not  due  to  chance.  It 
indicates  "...the  probability  that  statistical  significance  will  be  attained  given  there 
really  is  a  treatment  effect"  (Lipsey,  1990,  p.  20).  To  conduct  power  analyses, 
statisticians  must  make  assumptions  before  calculating  the  proposed  study’s  power. 
Those  assumptions  are  discussed  below. 


Data  Collection  Assumptions 

The  power  analyses  presented  here  are  based  on  two  different  data  collection  plans. 

They are as follows:

1.  The  short-term  plan  stops  recruitment  (Wave  1)  at  all  sites  on  June  30,  1993, 
for  a  total  of  1065  Wave  1  cases,  and  stops  all  data  collection  for  Waves  2  and 
3  on  September  30,  1993.  This  plan,  after  correction  for  attrition,  would 
include  approximately  299  Demonstration  cases  and  150  Comparison  cases 
with  complete  Wave  3  data. 

2.  The  longer  plan  stops  recruitment  (Wave  1)  at  the  Demonstration  site  on  June 
30,  1993,  and  at  the  Comparison  sites  December  30,  1993,  for  a  total  of  1125 
Wave  1  cases.  In  this  scenario,  all  Wave  2  and  Wave  3  data  collection  would 
end  June  30,  1994.  This  plan,  after  correction  for  attrition,  would  include 
approximately  426  Demonstration  cases  and  361  Comparison  cases  at  Wave  3. 


Clinical  Assumptions 

1.  Many  children  will  improve  in  both  settings,  but  more  children  will  improve  in 
the Demonstration because there will be, on the average, a better fit between
the  child  and  his  or  her  treatment. 

2.  Important  improvement  due  to  the  treatment  will  continue  to  accrue  for  at 
least the first year following the start of treatment.

Statistical Assumptions

1.  Statistical  tests  will  be  run  at  p(a)  =  0.05.  This  means  we  will  follow  the 
scientific  norm  of  being  95%  certain  that  observed  differences  are  not  due  to 
chance. 

2.  The  Evaluation  is  attempting  to  detect  a  difference  of  at  least  0.25  standard 
deviations  (SD)  difference  in  improvement  between  the  average  child  at  the 
Demonstration  site  and  the  average  child  at  the  Comparison  sites.  The  study 
will  not  be  capable  of  effectively  detecting  effects  smaller  than  .25  SD.  This 
effect  size  was  derived  from  a  meta-analysis  on  child  psychotherapy  as  the 
mean  effect  size  found  for  nonbehavioral  treatment  using  instruments  similar 
to  ones  used  in  the  Evaluation  (Lampman,  Durlak  &  Wells,  1992).  A 
difference of 0.25 SD is the same as saying that 50% of the patients in the
Comparison get better while 63% of those in the Demonstration get
better.

3. All children, on the average, will improve by 0.3 SD due to treatment and
time;  Demonstration  children  will  improve  an  additional  0.25  SD  due  to 
treatment  conditions  unique  to  the  Demonstration. 

4.  The  goal  is  to  determine  only  if  the  children  at  the  Demonstration  site  have 
better  outcomes,  in  general,  than  the  children  at  the  Comparison  sites. 
Separate analysis of important subgroups, such as boys versus girls, or certain
diagnoses,  such  as  conduct  disorder  or  depression,  will  be  foregone  because 
too  many  subjects  would  be  needed  to  have  any  assurance  of  having 
interpretable  results  given  the  predicted  effect  size. 

5.  A  powerful  repeated-measures  analysis  of  variance  will  be  conducted, 
improving  precision  by  using  each  subject  as  his/her  own  control,  to  see 
whether the Demonstration group improves more over time.

6.  Since  the  quasi-experimental  design  applied  in  this  Evaluation  is  unique, 
standard  power  curve  tables  could  not  be  used.  Instead,  statistical  modeling 
was  used.  In  this  model,  over  1240  hypothetical  complete  data  sets  were 
computer-generated  according  to  the  statistical  assumptions  stated  above. 
Each  "model"  data  set  was  analyzed  with  a  repeated  measures  variance 
analysis. 
Results of Power Calculations

The  power  calculations  produced  the  following  results: 

Plan               Number of cases after attrition      Statistical Power

Short-term plan    299 Demonstration, 150 Comparison    50%

Longer plan        426 Demonstration, 361 Comparison    80%

This result means that there is a 50% chance, under the short-term plan, that the
statistical  analysis  will  say  that  the  results  of  the  study  are  due  to  chance,  even  if 
more  children  improve  at  the  Demonstration  site.  Hence,  with  the  sample  included 
under  the  short-term  plan,  the  analyses  will  be  too  insensitive  to  detect  true  results. 


Recommendation 

While  the  short-term  plan  saves  some  money,  it  creates  great  risk  (50%)  that  an 
important  clinical  improvement  will  be  inseparable  from  the  random  effects  of 
chance.  The  longer  plan  will  provide  the  generally  accepted  assurance  (80%)  that  the 
research  will  have  enough  data  to  detect  results  should  they  occur.  It  should  be 
noted,  however,  that  even  with  this  longer  plan,  this  80%  assurance  that  the  effects  of 
the  Demonstration  can  be  detected  leaves  a  20%  chance  that  important  effects  will 
be  overlooked. 


Lampman, C., Durlak, J., & Wells, A. (1992). Statistical Power in Child
Psychotherapy Outcome Research. Paper presented at the 1992 American
Psychological Association Convention.

Lipsey,  M.  (1990).  Design  Sensitivity:  Statistical  Power  for  Experimental  Research. 
Newbury  Park,  CA:  Sage. 
Abstract


[...] outcome measure and general form of treatment used. Based on
these data, the number of subjects necessary to attain 80% power
using various outcome measures and treatments was calculated.
The sample sizes needed to achieve adequate power were from two
to six times greater than the actual number of subjects typically used
in previous child therapy studies. These data underscore the need
for researchers to attend to power considerations when planning
child therapy outcome studies.

Introduction 

Statistical  Power  in  Child  Psychotherapy  Outcome 
Research 

Statistical power is defined as "... the probability that statistical
significance will be attained given that there really is a treatment
effect" (Lipsey, 1990, p. 20). In other words, power is the probability
of correctly rejecting a false null hypothesis. The likelihood of
detecting a treatment effect is associated with many features of a
study's design, including the group assignment procedure, the
reliability of measures, the fidelity with which treatment is
implemented and characteristics of the samples and settings used.
However, even if an experiment is demonstrated to have adequate
internal, construct and external validity, it may still fail to be sensitive
enough (statistically speaking) to detect a treatment outcome.
Statistical power is related to the statistical conclusion validity of a
study (Cook and Campbell, 1979), and the critical factor here is
sample size.

A number of power surveys have been conducted of various
psychological literatures, including Cohen's seminal (1962) paper on
statistical power in abnormal and social psychological research. In
general, these studies have demonstrated that psychological
researchers often design, conduct and publish data based on
studies with inadequate power (Chase and Chase, 1976; Cohen,
1962; Holmes, 1979; Rossi, 1990). Most of these reviewers have
admonished researchers to address the issue of statistical power in
the planning phases of research rather than as a post hoc
explanation for findings failing to support a treatment's effectiveness.
Despite the usefulness of the power reviews, it appears the
statistical conclusion validity of studies published in even the most
prestigious journals has not improved over the past several decades
(Sedlmeier and Gigerenzer, 1989).


One reason that power calculations are not routine procedures in the
design of studies may be that an estimate of the expected treatment
effect is needed to compute power, along with sample size, [...]
probability level and directionality of the test (one- vs. two-tailed).
The power surveys described above typically use [...] sample size
to determine the probability that it could detect [...] medium and
large treatment effects (see Cohen, 1962). [...] is that the true
population effect sizes are unknown [...] and thus it is difficult to
design a study with an adequate, [...] excessive sample size. The
size of a sample needed to [...] a reasonable, say 80 percent, power
level differs enormously [...] small [...] and large (e.g., .70) effects.
[...] this problem is meta-analysis, which has [...] popular technique
for summarizing the findings from [...] population effect size than
traditional non-quantitative reviews. [...] meta-analytic review of the
child psychotherapy literature to [...] researchers with useful
information for the design of [...]

Method 

Meta-analytic Procedures

A total of 267 child psychotherapy outcome studies were reviewed;
[...] were journal articles, [...] were book chapters and 66 were
unpublished dissertations. Studies eligible for review consisted of
reports appearing through the end of 1983 in which some form of
psychotherapy for maladapting children (age < 13) was compared
with a control group.

Separate effect sizes (ESs) were calculated for each of six categories
of outcome measures: behavioral observations, peer
sociometrics, measures of academic achievement (standardized
test scores or school grades), nonacademic performance measures
(e.g., measures of interpersonal problem solving skills and cognitive
tempo) and both normed and non-normed rating scales and
checklists. Initially 1237 ESs were calculated; however, effects
within the same outcome category and same type of treatment were
averaged within each study, resulting in a total of 656 ESs that were
used in analyses.

Power  Analyses 


Results and Discussion

The results of the meta-analysis indicate that the type of measure
used to assess outcome and the general type of treatment
conducted interact to moderate the impact of child psychotherapy. In
fact, twelve distinct clusters of studies were found, with widely
varying effect sizes, suggesting that the interpretation of an overall
effect size for child psychotherapy would be misleading. The
associated sample sizes and power calculations suggest that it
would be prudent for researchers planning child psychotherapy
outcome studies to think carefully about the selection of outcome
measures, as they appear to differ in the ability to detect the
effectiveness of various general types of treatment. For example, if
an investigator is assessing the effects of nonbehavioral treatment
using peer sociometric outcome measures, a sample size of 199
subjects per group is needed to attain 80% power. This estimate is
based on a one-tailed test, alpha = .05, to test the difference between
a treatment and control group. This estimate is quite liberal for
several reasons. First, it assumes that the treatment group would
outperform the control group; a two-tailed test would require even
more subjects per group. Second, treatment versus treatment
comparisons have been found to yield significantly smaller effects
than treatment versus control comparisons (Kazdin & Bass, 1989).
Finally, the sample sizes necessary to achieve 80% power also
increase as alpha decreases.
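
The dependence of the required sample size on tail choice and alpha can be illustrated with a short calculation. The sketch below is not the authors' procedure: it uses the textbook normal approximation for comparing two equal-sized independent groups, and the effect size of 0.25 SD, the function name `n_per_group`, and its parameters are assumptions chosen for illustration. Because it omits the small-sample t correction, it may come out slightly below tabled values.

```python
import math
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.80, one_tailed=True):
    """Approximate subjects per group needed to detect a standardized
    mean difference d between two equal-sized independent groups
    (normal approximation; hypothetical helper, not from the source)."""
    std_normal = NormalDist()
    tail_prob = alpha if one_tailed else alpha / 2
    z_alpha = std_normal.inv_cdf(1 - tail_prob)  # critical value of the test
    z_power = std_normal.inv_cdf(power)          # quantile for the target power
    return math.ceil(2 * (z_alpha + z_power) ** 2 / d ** 2)


if __name__ == "__main__":
    d = 0.25  # illustrative effect size in SD units (assumption)
    print("one-tailed, alpha=.05:", n_per_group(d))
    print("two-tailed, alpha=.05:", n_per_group(d, one_tailed=False))
    print("one-tailed, alpha=.01:", n_per_group(d, alpha=0.01))
```

Under these assumptions the two-tailed and smaller-alpha cases both call for more subjects per group than the one-tailed, alpha = .05 case, which is the pattern the paragraph above describes.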

The large discrepancies between sample sizes actually used and
those necessary to attain an acceptable level of power in the
studies shown in Table 1 make it difficult to assess how closely the
obtained treatment effect sizes represent true population effects.
This, in turn, underscores the need for researchers to attend to
power considerations when planning therapy outcome studies.


Table 1

Mean effect sizes*, mean sample sizes, and sample sizes necessary to
achieve acceptable power** for twelve homogeneous*** subgroups of
child psychotherapy studies.

Each cell reports the mean ES (n of studies), the mean N per group,
and the N per group needed for 80% power.

Type of Outcome Measure      Behavioral Treatment    Nonbehavioral Treatment

Behavioral Observation       [illegible]             [illegible]
Peer Sociometrics            [illegible]             [illegible]
Normed Rating Scales         [illegible]             [illegible]
Non-normed Rating Scales     [illegible]             [illegible]
Achievement Measures         [illegible]             [illegible]
Performance Measures         [illegible]             [illegible]

*   all effect sizes differed significantly from zero (p < [...]); n of studies in parentheses
**  alpha = .05, one-tailed test, n per group
*** each subgroup exhibited within-group homogeneity of effect sizes (p < [...])
Power Analysis

Power analysis is done when planning a study in order to determine how many
subjects  are  needed  for  adequate  statistical  power.  Power  is  the  ability  to  detect  differences 
when  they  actually  occur.  A  power  analysis  is  not  normally  done  by  analyzing  real  data 
because  it  is  not  legitimate  to  look  at  data  and  then  stop  gathering  when  the  desired  results 
occur.  This  is  true  because  standard  statistical  tests  all  assume  that  the  test  is  done  once  then 
reported:  to  use  non-standard  procedures  with  these  tests  would  seriously  hurt  their  accuracy. 

The  Monte  Carlo  power  analysis  was  based  on  a  simplification  of  the  DMM  analysis 
described  above,  viz.  univariate  repeated  measures  analysis.  This  simpler  analysis  is  more 
powerful  (requires  fewer  subjects)  than  the  doubly  multivariate  analysis  because  fewer 
variables  are  used  and  also  because  fewer  parameters  had  to  be  estimated  by  the  simpler 
analysis.  (The  ANOVA  assumes  correlations  are  uniform;  the  MANOVA  estimates  them.) 

In  comparing  the  power  of  this  simple  repeated  measures  design  to  univariate 
repeated MANOVA, we found that MANOVA costs about a 5% loss of power (or roughly
100 more cases). Thus the single variable repeated measures ANOVA is a conservative test,
telling  us  we  need  fewer  subjects  than  we  actually  need  for  adequate  statistical  power  (80% 
chance  of  finding  an  effect  given  an  effect  exists). 

Using  computer  generated  data  surely  would  sound  strange  to  the  nonstatistician,  but 
"Monte  Carlo"  simulations  are  the  standard  method  used  by  statisticians  to  test  statistical 
ideas  when  the  problem  is  too  complicated  to  describe  by  purely  theoretical  equations.  In  the 
Monte  Carlo  power  analysis,  computer  generated  data  was  examined  to  make  sure  the  mean, 
standard  deviation,  and  the  cross-wave  correlations  were  correct.  Then  the  "data"  were 
analyzed  in  repeated  measures  ANOVA  or  univariate  MANOVA.  By  repeating  this  process 
hundreds  of  times  and  then  keeping  score  on  the  results  we  could  see  what  actually 
happened  when  we  analyze  data  like  the  Ft.  Bragg  Demonstration’s.  Trying  the  analysis 
with varying numbers of "subjects" permits us to find out how many subjects were needed for
80%  power.  If  we  peeked  prematurely  at  the  real  data  in  order  to  decide  when  we  had 
enough  subjects,  we  would  have  ruined  the  chance  to  use  standard  statistical  estimates  in  the 
way  that  they  were  designed. 
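
The Monte Carlo procedure described above (generate data, analyze it, repeat, and keep score) can be sketched in a few lines. The following is a simplified illustration and not the evaluators' program: it borrows the score-generating weights and the -0.30 and -0.25 SD shifts from the SAS listing reproduced in this appendix, but it replaces the repeated measures ANOVA with a large-sample two-tailed z test on Wave 3 minus Wave 1 change scores. The function names, replication count, and seed are assumptions.

```python
import math
import random
from statistics import NormalDist

SQRT2 = math.sqrt(2.0)
TIME_EFFECT = -0.30  # change shared by all children by Wave 3 (SD units)
DEMO_EFFECT = -0.25  # additional change at the Demonstration site


def change_scores(n, site_effect, rng):
    """Simulate Wave 3 minus Wave 1 change scores for n children."""
    out = []
    for _ in range(n):
        within = rng.gauss(0, 1)  # stable child-level component
        e1, e2, e3 = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        wave1 = (within + e1) / SQRT2
        wave3 = ((0.5 * within + 0.3 * e2 + 1.2 * e3) / SQRT2
                 + TIME_EFFECT + site_effect)
        out.append(wave3 - wave1)
    return out


def mean_and_var(xs):
    """Sample mean and unbiased sample variance."""
    m = sum(xs) / len(xs)
    v = sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    return m, v


def estimate_power(n_demo, n_comp, reps=300, alpha=0.05, seed=1):
    """Share of simulated studies in which the site difference in
    change scores is significant (two-tailed z test)."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    significant = 0
    for _ in range(reps):
        m_d, v_d = mean_and_var(change_scores(n_demo, DEMO_EFFECT, rng))
        m_c, v_c = mean_and_var(change_scores(n_comp, 0.0, rng))
        z = (m_d - m_c) / math.sqrt(v_d / n_demo + v_c / n_comp)
        if abs(z) > z_crit:
            significant += 1
    return significant / reps


if __name__ == "__main__":
    print("short-term plan:", estimate_power(299, 150))
    print("longer plan:   ", estimate_power(426, 361))
```

Because the test statistic here differs from the one the evaluators used, the estimates from this sketch should be read only as an illustration of the simulate-and-count logic, not as a reproduction of the reported power values.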
Demo vs. Comparison

                             Comp                        Demo
                      MEAN     STD       N        MEAN     STD       N
Wave 1 CBCL score    66.12   10.00   30000       66.05   10.00   50000
Wave 2 CBCL score    64.48   10.01   30000       63.19   10.00   50000
Wave 3 CBCL score    62.95    9.37   30000       60.52    9.46   50000

Basic data                               12:53 Wednesday, April 7, 1993

CORRELATION ANALYSIS

3 'VAR' Variables:  CBCLXIJ1  CBCLXIJ2  CBCLXIJ3

Simple Statistics

Variable        N       Mean    Std Dev        Sum    Minimum     Maximum   Label
CBCLXIJ1    80000    66.0775     9.9994    5286198    23.4102    113.2344   Wave 1 CBCL score
CBCLXIJ2    80000    63.6737    10.0209    5093899    21.0078    104.3438   Wave 2 CBCL score
CBCLXIJ3    80000    61.4283     9.4999    4914260    21.6016    100.9531   Wave 3 CBCL score

Pearson Correlation Coefficients / Prob > |R| under Ho: Rho=0 / N = 80000

                       CBCLXIJ1    CBCLXIJ2    CBCLXIJ3

CBCLXIJ1                1.00000     0.49971     0.26472
Wave 1 CBCL score        0.E+00      0.E+00      0.E+00

CBCLXIJ2                0.49971     1.00000     0.42365
Wave 2 CBCL score        0.E+00      0.E+00      0.E+00

CBCLXIJ3                0.26472     0.42365     1.00000
Wave 3 CBCL score        0.E+00      0.E+00      0.E+00


UNIVARIATE PROCEDURE

Variable=CBCLXIJ1     Wave 1 CBCL score

Moments

N            80000       Sum Wgts     80000
Mean         66.07747    Sum          5286198
Std Dev      9.999381    Variance     99.98762
Skewness     -0.00057    Kurtosis     -0.0047
USS          3.573E8     CSS          7998910
CV           15.13281    Std Mean     0.035353
T:Mean=0     1869.069    Prob>|T|     0.E+00
Sgn Rank     1.6E9       Prob>|S|     0.E+00
Num ^= 0     80000

Quantiles(Def=5)

100% Max     113.2344    99%          89.39063
 75% Q3      72.8125     95%          82.39844
 50% Med     66.125      90%          78.82813
 25% Q1      59.3125     10%          53.20313
  0% Min     23.41016     5%          49.53125
                          1%          43.02344
Range        89.82422
Q3-Q1        13.5
Mode         66.76563

Extremes

Lowest       Obs         Highest      Obs
23.41016(    74781)      103.6875(    43762)
25.33594(    55893)      105.1875(    33112)
26.26953(    70713)      105.6563(    79624)
28.03516(    54923)      106.875(     33460)
28.66406(    27988)      113.2344(    28944)

UNIVARIATE PROCEDURE

Variable=CBCLXIJ2     Wave 2 CBCL score

Moments

N            80000       Sum Wgts     80000
Mean         63.67374    Sum          5093899
Std Dev      10.02092    Variance     100.4189
Skewness     -0.00888    Kurtosis     0.000346
USS          3.3238E8    CSS          8033413
CV           15.73792    Std Mean     0.035429
T:Mean=0     1797.205    Prob>|T|     0.E+00
Sgn Rank     1.6E9       Prob>|S|     0.E+00
Num ^= 0     80000

Quantiles(Def=5)

100% Max     104.3438    99%          87.10938
 75% Q3      70.4375     95%          80.17188
 50% Med     63.72656    90%          76.51563
 25% Q1      56.94531    10%          50.80469
  0% Min     21.00781     5%          47.1875
                          1%          40.21875
Range        83.33594
Q3-Q1        13.49219
Mode         66.17188

Extremes

Lowest       Obs         Highest      Obs
21.00781(    27417)      102.2188(    24285)
21.67969(    41013)      102.2188(    61906)
23.29688(    61119)      102.6563(     9283)
23.60547(    54923)      103.1719(    78443)
23.98828(    39869)      104.3438(    14973)


UNIVARIATE PROCEDURE

Variable=CBCLXIJ3     Wave 3 CBCL score

Moments

N            80000       Sum Wgts     80000
Mean         61.42825    Sum          4914260
Std Dev      9.499866    Variance     90.24745
Skewness     -0.01171    Kurtosis     -0.01007
USS          3.0909E8    CSS          7219706
CV           15.46498    Std Mean     0.033587
T:Mean=0     1828.924    Prob>|T|     0.E+00
Sgn Rank     1.6E9       Prob>|S|     0.E+00
Num ^= 0     80000

Quantiles(Def=5)

100% Max     100.9531    99%          83.52344
 75% Q3      67.89063    95%          76.98438
 50% Med     61.42188    90%          73.60938
 25% Q1      55          10%          49.26563
  0% Min     21.60156     5%          45.78516
                          1%          39.20703
Range        79.35156
Q3-Q1        12.89062
Mode         64.20313

Extremes

Lowest       Obs         Highest      Obs
21.60156(    16801)      96.45313(    78827)
23.44531(     4018)      97.20313(    77062)
25.29297(    42150)      98.20313(     2814)
25.29688(    35497)      99.15625(    44423)
26.17188(    63423)      100.9531(     2098)



General Linear Models Procedure
Class Level Information

Class     Levels    Values
SITE           2    Comp Demo

Number of observations in data set = 80000



General Linear Models Procedure
Repeated Measures Analysis of Variance
Repeated Measures Level Information

Dependent Variable    CBCLXIJ1    CBCLXIJ2    CBCLXIJ3

Level of WAVE                1           2           3


General Linear Models Procedure
Repeated Measures Analysis of Variance
Tests of Hypotheses for Between Subjects Effects

Source    DF    Type III SS    Mean Square    F Value    Pr > F

13823067



General Linear Models Procedure
Repeated Measures Analysis of Variance
Univariate Tests of Hypotheses for Within Subject Effects

Source: WAVE
                                                             Adj Pr > F
    DF    Type III SS    Mean Square    F Value    Pr > F    G-G       H-F
     2    711529.5857    355764.7928    6129.16    0.E+00    0.E+00    0.E+00

Source: WAVE*SITE
                                                             Adj Pr > F
    DF    Type III SS    Mean Square    F Value    Pr > F    G-G       H-F
     2    51927.2[...]    25963.6450    447.[...]  2E-194    4E-187    4E-187

Source: Error(WAVE)

        DF    Type III SS     Mean Square
    159996    9286901.3432        58.0446

Greenhouse-Geisser Epsilon = 0.9614
Huynh-Feldt Epsilon = 0.9615

NOTE: Copyright(c) 1985,86,87 SAS Institute Inc., Cary, NC 27512-8000, U.S.A.
NOTE: SAS (r) Proprietary Software Release 6.04
      Licensed to VANDERBILT UNIVERSITY, Site 11765001.

NOTE: AUTOEXEC processing completed.

* Set numbers prior to running a problem;
%let NDemo = 50000; %let NComp = 30000;
%let VarDemo = -0.25;
%let VarTime = -0.30;
data ALL_lies (keep = SITE CBCLxij1 CBCLxij2 CBCLxij3);

   sqrt2 = sqrt(2.0);

   * make scores;
   /*
      varwithin     variance within subjects
      varandm1      random err. on measurement 1
      varandm2      random err. on measurement 2
      varandm3      random err. on measurement 3
      rannor(seed)  sas random function mean = 0, sd = 1
      CBCLxij3 = CBCL score, Xsub ij on third occasion
               = half of (variance within + random3) + effect of time
                 + effect of demonstration
   */

   do i = 1 to &NDemo by 1;
      SITE = "Demo";
      varwithn = rannor(0);
      varandm1 = rannor(0);
      varandm2 = rannor(0);
      varandm3 = rannor(0);

      CBCLxij1 = (varwithn + varandm1)/sqrt2;
      CBCLxij2 = (varwithn + varandm2)/sqrt2 + &VarTime/2 + &VarDemo/2;
      CBCLxij3 = (0.5*varwithn + 0.3*varandm2 + 1.20*varandm3)/sqrt2
                 + &VarTime + &VarDemo;

      CBCLxij1 = (10.0 * CBCLxij1) + 66.0;    /* mean 66, Sd 10 */
      CBCLxij2 = (10.0 * CBCLxij2) + 66.0;
      CBCLxij3 = (10.0 * CBCLxij3) + 66.0;
      output;
   end;

   do i = 1 to &NComp by 1;
      SITE = "Comp";
      varwithn = rannor(0);
      varandm1 = rannor(0);
      varandm2 = rannor(0);
      varandm3 = rannor(0);

      CBCLxij1 = (varwithn + varandm1)/sqrt2;
      CBCLxij2 = (varwithn + varandm2)/sqrt2 + &VarTime/2;
      CBCLxij3 = (0.5*varwithn + 0.3*varandm2 + 1.20*varandm3)/sqrt2
                 + &VarTime;

      CBCLxij1 = (10.0 * CBCLxij1) + 66.0;    /* mean 66, Sd 10 */
      CBCLxij2 = (10.0 * CBCLxij2) + 66.0;
      CBCLxij3 = (10.0 * CBCLxij3) + 66.0;
      output;
   end;

   attrib SITE                                 label = 'Demo vs. Comparison'
          CBCLxij1 format = 5.1 length = 3 label = 'Wave 1 CBCL score'
          CBCLxij2 format = 5.1 length = 3 label = 'Wave 2 CBCL score'
          CBCLxij3 format = 5.1 length = 3 label = 'Wave 3 CBCL score';
run;

NOTE: The data set WORK.ALL_LIES has 80000 observations and 4 variables.
NOTE: The DATA statement used 2.45 minutes.

options linesize = 72;
proc tabulate f = 6.2 data = ALL_lies;
   class SITE;
   var CBCLxij1 CBCLxij2 CBCLxij3;
   table (CBCLxij1 CBCLxij2 CBCLxij3),
         (SITE)*(mean*f=6.2 std*f=6.2 N*f=6.0);
   title 'Basic data';
run;

NOTE: The PROCEDURE TABULATE used 1.02 minutes.

proc corr;
   var cbclxij1 -- cbclxij3;
proc univariate;

NOTE: The PROCEDURE CORR used 34.00 seconds.

   var cbclxij1 -- cbclxij3;

options linesize = 72;
proc glm;

NOTE: The PROCEDURE UNIVARIATE used 2.15 minutes.

   classes SITE;
   model CBCLxij1 CBCLxij2 CBCLxij3 = SITE/nouni;
   repeated wave 3/nom;
run; quit;

NOTE: The PROCEDURE GLM used 2.57 minutes.

endsas;

NOTE: SAS Institute Inc., SAS Circle, PO Box 8000, Cary, NC 27512-8000
The University of Texas
Health  Science  Center  at  Houston 


SCHOOL  OF  PUBLIC  HEALTH 
Health  Services  Organization 


1200  Herman  Pressler 
P.O. Box 20186
Houston,  Texas  77225 
(713)  792-4372 
(713)  792-4471 


May  10,  1993 


Edward  D.  Martin,  M.D. 

Assistant  Secretary  of  Defense  (Health  Affairs) 
The  Pentagon,  Washington  DC 


Dear  Dr.  Martin: 

I  have  now  completed  my  review  of  the  materials  submitted  to 
me  on  April  23,  1993,  by  Dr.  Scott  Optenberg. 

In  the  absence  of  information  on  several  key  factors  relevant 
to  the  successful  execution  of  a  project  of  this  magnitude,  it  is 
indeed  impossible  to  conduct  an  objective  evaluation  of  all  the 
claims  of  the  investigators  for  the  Fort  Bragg  Demonstration 
project. I will therefore limit my comments to the power analysis
performed by the Army Statisticians in an in-house effort to
determine  if  the  demonstration  project  should  be  extended. 

The  investigators  at  Fort  Bragg  are  interested  in  detecting  a 
standardized  difference  of  .25  between  the  experimental  and  control 
subjects for the short-term plan. They anticipate 299
demonstration and 150 control cases at wave 3. As demonstrated by
the detailed power analysis developed for this purpose by Dr.
Optenberg's group, no matter what assumptions are made on the
variances  of  the  two  populations,  the  minimum  power  that  may  be 
attained  at  wave  3  of  the  analysis  is  about  81%.  The  derivation  of 
the  power  analysis  is  based  on  the  theoretical  developments 
presented  in  Cohen's  (1988)  book  which  is  regarded  as  the  basic 
text  on  power  analysis  in  behavioral  sciences. 

Similarly, using the anticipated number of cases at the end
(wave 3) of the long-term plan (i.e., 426 demonstration and 361
comparison cases), a power of at least 90% will be obtained.
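
As a rough illustration of how such power figures can be computed without simulation, the sketch below applies the standard normal approximation for a comparison of two independent group means expressed in SD units. It is not the Army statisticians' derivation, which is not reproduced in this letter; the function name `two_group_power`, the equal-variance setup, and the choice to show both one-tailed and two-tailed tests are assumptions.

```python
import math
from statistics import NormalDist


def two_group_power(d, n1, n2, alpha=0.05, one_tailed=False):
    """Approximate power to detect a standardized difference d between
    two independent groups of size n1 and n2 (normal approximation;
    hypothetical helper, not taken from the source)."""
    std_normal = NormalDist()
    # Expected value of the test statistic when the true difference is d.
    shift = d / math.sqrt(1 / n1 + 1 / n2)
    if one_tailed:
        crit = std_normal.inv_cdf(1 - alpha)
        return 1 - std_normal.cdf(crit - shift)
    crit = std_normal.inv_cdf(1 - alpha / 2)
    # Two-tailed: rejection in either direction counts.
    return (1 - std_normal.cdf(crit - shift)) + std_normal.cdf(-crit - shift)


if __name__ == "__main__":
    plans = {"short-term": (299, 150), "long-term": (426, 361)}
    for name, (n_demo, n_comp) in plans.items():
        one = two_group_power(0.25, n_demo, n_comp, one_tailed=True)
        two = two_group_power(0.25, n_demo, n_comp)
        print(f"{name}: one-tailed {one:.3f}, two-tailed {two:.3f}")
```

The result depends noticeably on whether the test is one-tailed or two-tailed and on whether the 0.25 SD difference is defined on raw scores or on change scores, so the assumptions behind any quoted power value matter.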

In  the  Fort  Bragg  demonstration  project,  a  power  of  .80  for 
detection  of  a  relatively  small  difference  (i.e.  .25  SD)  in 
improvement  between  subjects  in  the  experimental  and  control  groups 
is  very  impressive  considering  that  most  research  studies  in  social 
sciences are underpowered (power < .80) for detecting anything but
large differences (Lipsey, 1990). Thus, the short-term plan is more
than  sufficient  to  meet  the  objectives  of  this  demonstration 
project. 
The investigators' justification for a long-term plan is based
on the argument that if only the short-term plan were to be carried
out, the likelihood of detecting a statistical significance in the
presence of a treatment effect would be 50%. This claim has not
been demonstrated mathematically by the investigators and, as shown
by the power analysis performed by the Army statisticians using
appropriate statistical procedures, is in serious error.

On reviewing the documentation dated April 30, 1993 from
Vanderbilt University (received by me on 5/8/93), some
inconsistency in the claims of the Investigators/Evaluators of the
demonstration project is apparent. On page 4 of the above
document, they state that the project has been losing about 15% of
the subjects per wave. Using this attrition rate, the 1065 wave 1
cases (demonstration plus control) should result in 1065 (.85)(.85)
or 769 cases. Yet under data collection assumptions the 1065 wave
1 cases will result in only 449 (299 demonstration and 150 control)
cases. Therefore, the statistical power under the proposed
short-term plan may be even higher than 81%.
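
The attrition arithmetic in the preceding paragraph can be checked directly. The short sketch below reproduces the 1065 (.85)(.85) calculation and, going the other way, computes the constant per-wave attrition rate that would be needed to shrink 1065 Wave 1 cases to 449 by Wave 3. The function names are illustrative, and a constant per-wave rate is an assumption.

```python
def retained_cases(wave1_cases, attrition_per_wave, later_waves=2):
    """Cases expected to remain after losing a fixed share at each later wave."""
    return wave1_cases * (1 - attrition_per_wave) ** later_waves


def implied_attrition(wave1_cases, final_cases, later_waves=2):
    """Constant per-wave attrition rate that takes wave1_cases to final_cases."""
    return 1 - (final_cases / wave1_cases) ** (1 / later_waves)


if __name__ == "__main__":
    print("expected Wave 3 cases at 15% attrition per wave:",
          round(retained_cases(1065, 0.15)))
    print("per-wave attrition implied by 449 Wave 3 cases:",
          round(implied_attrition(1065, 449), 3))
```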

Furthermore,  investigators  have  repeatedly  mentioned  not 
wanting  to  "peek  prematurely"  at  the  real  data  for  fear  of  "ruining 
the  chance  to  use  standard  statistical  estimates  in  the  way  that 
they  were  designed".  To  obtain  the  power  associated  with  a  study 
on  treatment  effectiveness,  all  one  needs  is  some  assumption  on  the 
variance  of  the  two  treatment  outcomes  (in  this  study  demonstration 
and control cases), the number of individuals in each group, the
effect  size  and  the  level  of  significance.  Power  calculation  does 
not  require  a  "peek"  at  the  actual  data.  Hence  the  use  of  Monte 
Carlo  simulation  to  estimate  the  power  of  the  study  is  unnecessary 
and  irrelevant. 

If  I  may  be  of  further  help,  please  feel  free  to  call  me  at 
(713)  792-4472. 

Sincerely, 


Asha  S.  Kapadia 
Professor  and  Convener 
of  Biometry 

ASK:rf 
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